SAM: String-based sequence search algorithm for mitochondrial DNA database queries


Abstract

The analysis of the haploid mitochondrial (mt) genome has numerous applications in forensic and population genetics, as well as in disease studies. Although mtDNA haplotypes are usually determined by sequencing, they are rarely reported as nucleotide strings. Traditionally, they are presented in a difference-coded, position-based format relative to the corrected version of the first sequenced mtDNA. This convention depends on recommendations for standardized sequence alignment, and alignment practice is known to vary between scientific disciplines and even between laboratories. As a consequence, database searches, which are vital for the interpretation of mtDNA data, can return biased results when query and database haplotypes are annotated differently. In the forensic context, this would usually lead to underestimation of the absolute and relative frequencies. To address this issue, we introduce SAM, a string-based search algorithm that converts query and database sequences to position-free nucleotide strings and thus eliminates the possibility that identical sequences will be missed in a database query. Simply applying a BLAST algorithm would not be a sufficient remedy, because BLAST uses a heuristic approach and does not address properties specific to mtDNA, such as insertion and deletion events that may be phylogenetically stable or rapidly evolving. The software presented here provides additional flexibility to incorporate phylogenetic data, site-specific mutation rates, and other biologically relevant information that would refine the interpretation of mitochondrial DNA data. The manuscript is accompanied by freeware and example data sets that can be used to evaluate the new software (http://stringvalidation.org).
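The core idea of position-free comparison can be illustrated with a minimal Python sketch. It is not the SAM implementation: the toy reference, the entry syntax (`7C` for a substitution, `2.1C` for an insertion after position 2, `2DEL` for a deletion) and the function names `to_string` and `same_haplotype` are assumptions made for illustration only. The sketch rebuilds the plain nucleotide string from a difference-coded haplotype and compares strings instead of annotations.

```python
import re

# Toy reference for illustration only; this is NOT the real mtDNA reference.
# Positions are 1-based, as in difference-coded haplotype notation.
TOY_REFERENCE = "ACCCCCTG"

# Hypothetical entry syntax: "7C" (substitution), "2.1C" (insertion), "2DEL" (deletion).
_DIFF = re.compile(r"^(\d+)(?:\.(\d+))?([ACGT]|DEL)$")


def to_string(reference, differences):
    """Rebuild the plain nucleotide string from difference-coded entries."""
    # One slot per reference position: [base, {insertion index: inserted base}].
    slots = [[base, {}] for base in reference]
    for diff in differences:
        match = _DIFF.match(diff.upper())
        if match is None:
            raise ValueError(f"unsupported entry: {diff!r}")
        pos, ins_index, value = int(match.group(1)), match.group(2), match.group(3)
        if not 1 <= pos <= len(reference):
            raise ValueError(f"position out of range: {diff!r}")
        slot = slots[pos - 1]
        if ins_index is not None:
            if value == "DEL":
                raise ValueError(f"an insertion needs a base: {diff!r}")
            slot[1][int(ins_index)] = value  # inserted after position pos
        elif value == "DEL":
            slot[0] = ""                     # deletion removes the base
        else:
            slot[0] = value                  # substitution replaces the base
    parts = []
    for base, insertions in slots:
        parts.append(base)
        parts.extend(insertions[k] for k in sorted(insertions))
    return "".join(parts)


def same_haplotype(reference, query, entry):
    """Compare two haplotypes by their strings, not by their annotations."""
    return to_string(reference, query) == to_string(reference, entry)


if __name__ == "__main__":
    # The same extra C in a C-stretch, annotated at two different positions.
    print(["2.1C"] == ["6.1C"])                               # annotations differ
    print(same_haplotype(TOY_REFERENCE, ["2.1C"], ["6.1C"]))  # strings agree
```

In this toy case, an extra C inside the C-stretch can be annotated as `2.1C` or as `6.1C`. A position-based comparison treats the two annotations as different haplotypes, while both expand to the same string, so the string comparison reports a match. The sketch covers exact matching over the full toy reference only.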
